Abstract

To solve the system of linear equations $Aw = r$ that arises from the discretization of a two-dimensional self-adjoint elliptic differential equation, iterative methods employing easily computed incomplete factorizations, $LU = A+B$, are frequently used. Dupont, Kendall, and Rachford [5] showed that, for the DKR factorization, the number of iterations {arithmetic operations} required to reduce the A-norm of the error by a factor of $\varepsilon$ is $O(h^{-1/2}\log\frac{1}{\varepsilon})$ {$O(h^{-5/2}\log\frac{1}{\varepsilon})$}, where $h$ is the stepsize used in the discretization. We present some error estimates which suggest that, if a pair of Alternating-Direction Factorizations are used, then the number of iterations {arithmetic operations} may be decreased to $O(h^{-1/3}\log\frac{1}{\varepsilon})$ {$O(h^{-7/3}\log\frac{1}{\varepsilon})$}. Numerical results supporting this estimate are included.

1.  Introduction.

Iterative methods are frequently used to solve the system of linear equations

$$Aw = r \tag{1.1}$$

that arises from the usual five-point discretization of the Dirichlet problem for the two-dimensional self-adjoint elliptic differential equation


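$$-\left\{\frac{\partial}{\partial x}\left[a_1\frac{\partial w}{\partial x}\right] + \frac{\partial}{\partial y}\left[a_2\frac{\partial w}{\partial y}\right] + q\,w\right\} = f \quad \text{in } \Omega, \tag{1.2}$$
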


where, throughout this paper, we assume

1. $\Omega$ is an open bounded region in $R^2$,
2. $a_1$, $a_2$ are Lipschitz continuous in $\bar\Omega$,
3. $a_1, a_2 \ge \eta > 0$ in $\bar\Omega$ for some constant $\eta$, and
4. $q \le 0$ is bounded in $\bar\Omega$.

The efficiency of many iterative methods depends upon the selection of an easily-inverted approximation $\tilde A$ to $A$. Several authors [2, 4, 5, 7, 9, 10, 11, 12, 13] have suggested taking $\tilde A$ to be an incomplete factorization of $A$,

$$\tilde A = LU = A+B, \tag{1.3}$$

where $B$ is chosen so that $L$ and $U$ are sparse.


For several of these factorizations, there are two directionally dependent forms of $B$: $B_1$ and $B_2$. Stone [13] found that, for his method, experimental results indicated that using the pair of incomplete factorizations alternately,

$$(A+B_1)w_{n+1/2} = (A+B_1)w_n - \omega(Aw_n - r), \tag{1.4}$$
$$(A+B_2)w_{n+1} = (A+B_2)w_{n+1/2} - \omega(Aw_{n+1/2} - r),$$

gave a faster rate of convergence than using either $\tilde A = A+B_1$ or $\tilde A = A+B_2$ alone in the stationary iteration

$$\tilde Aw_{n+1} = \tilde Aw_n - \omega(Aw_n - r). \tag{1.5}$$

Of course, eliminating $w_{n+1/2}$, we can rewrite the pair of equations (1.4) in the form (1.5) using

$$\tilde A = M_\omega = [A+B_1][(2-\omega)A+B_1+B_2]^{-1}[A+B_2] \tag{1.6}$$

provided that $[(2-\omega)A+B_1+B_2]$ is nonsingular.¹ We refer to the right side of (1.6) as an Alternating-Direction Incomplete Factorization. Although $M_\omega$ itself may be costly to compute, it is relatively inexpensive to solve $M_\omega x = b$, and it is the solution of such systems that is required in the iteration (1.5) and its Chebyshev or conjugate gradient accelerations.

¹ Note that the formal inverse of $M_\omega$, $[A+B_2]^{-1}[(2-\omega)A+B_1+B_2][A+B_1]^{-1}$, is always well-defined.
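
The claim that the two half-steps (1.4) collapse into one step of (1.5) with the matrix of (1.6) can be checked numerically. The following is only a minimal sketch under stated assumptions: small dense random symmetric matrices stand in for $A$, $B_1$, and $B_2$ (hypothetical test data, not incomplete factorizations), and the function names are illustrative rather than taken from the paper.

```python
import numpy as np

def ad_two_half_steps(A, B1, B2, w, r, omega):
    """One sweep of (1.4): a half-step with A+B1, then a half-step with A+B2."""
    w_half = w - omega * np.linalg.solve(A + B1, A @ w - r)
    return w_half - omega * np.linalg.solve(A + B2, A @ w_half - r)

def single_step_with_M(A, B1, B2, w, r, omega):
    """One step of (1.5) with M = [A+B1][(2-omega)A+B1+B2]^{-1}[A+B2] from (1.6)."""
    # Solve M d = A w - r by applying the formal inverse of M factor by factor.
    y = np.linalg.solve(A + B1, A @ w - r)
    y = ((2.0 - omega) * A + B1 + B2) @ y
    d = np.linalg.solve(A + B2, y)
    return w - omega * d

rng = np.random.default_rng(0)
n = 6
G = rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)      # symmetric positive-definite stand-in for A
B1 = 0.1 * (G + G.T)             # small symmetric perturbations (test data only)
B2 = 0.1 * (H + H.T)
r = rng.standard_normal(n)
w0 = np.zeros(n)

w_a = ad_two_half_steps(A, B1, B2, w0, r, omega=1.0)
w_b = single_step_with_M(A, B1, B2, w0, r, omega=1.0)
print(np.allclose(w_a, w_b))
```

Note that `single_step_with_M` never forms $M_\omega$: it needs only two solves and one matrix-vector product, which is the sense in which solving $M_\omega x = b$ is inexpensive.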

In general, $M_\omega$ is nonsymmetric. Since, in many applications, it is advantageous for $\tilde A$ to be symmetric, we also consider

$$S_\omega^{-1} = \tfrac{1}{2}\left[M_\omega^{-1} + M_\omega^{-T}\right], \tag{1.7}$$

the symmetric part of $M_\omega^{-1}$. Again, although $S_\omega$ itself may be costly to compute, it is relatively inexpensive to solve $S_\omega x = b$.
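
To illustrate why a system with the symmetrized matrix is cheap to solve once systems with $M_\omega$ and its transpose are, here is a small sketch. It assumes a generic dense nonsymmetric matrix `M` as a stand-in for $M_\omega$, and the helper name `apply_S_inverse` is hypothetical.

```python
import numpy as np

def apply_S_inverse(solve_M, solve_MT, b):
    """Return S^{-1} b, where S^{-1} = (M^{-1} + M^{-T}) / 2 is the symmetric part of M^{-1}."""
    return 0.5 * (solve_M(b) + solve_MT(b))

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n)) + n * np.eye(n)   # nonsymmetric stand-in for M_omega
b = rng.standard_normal(n)

# Solving S x = b costs one solve with M and one with its transpose.
x = apply_S_inverse(lambda v: np.linalg.solve(M, v),
                    lambda v: np.linalg.solve(M.T, v), b)

# Cross-check against the explicitly formed symmetric part (for testing only).
S_inv = 0.5 * (np.linalg.inv(M) + np.linalg.inv(M).T)
print(np.allclose(S_inv, S_inv.T), np.allclose(S_inv @ b, x))
```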

For the DKR factorization (an incomplete factorization similar to Stone's), Dupont, Kendall, and Rachford [5] showed that the number of iterations of (1.5) required to reduce the A-norm of the error by a factor of $\varepsilon$ is $O(h^{-1}\log\frac{1}{\varepsilon})$ and the associated number of arithmetic operations is $O(h^{-3}\log\frac{1}{\varepsilon})$. Moreover, the iteration can be accelerated by Chebyshev or conjugate gradient methods, decreasing the number of iterations required to $O(h^{-1/2}\log\frac{1}{\varepsilon})$ and the associated number of arithmetic operations to $O(h^{-5/2}\log\frac{1}{\varepsilon})$. In this paper, we investigate whether these work estimates can be improved by using either the Alternating-Direction form (1.6) of the DKR factorization (AD-DKR) or the Symmetric Alternating-Direction form (1.7) of the DKR factorization (SAD-DKR).


$M_\omega$ can be viewed as a one parameter family of preconditionings for $A$. From this point of view, it follows that, when (1.5) is accelerated by the Chebyshev or conjugate gradient technique, the parameter $\omega$ internal to $M_\omega$ should be held fixed, while the external parameter $\omega$ in (1.5) is varied.
In Section 2, we review the DKR factorization and present a modification. In Section 3, we review some general results concerning the rate of convergence of the stationary iteration (1.5) and its Chebyshev or conjugate gradient acceleration. Since these results are dependent upon the spectra of $\tilde A^{-1}A$, we are led to an investigation of the eigenvalues of $M_\omega^{-1}A$ and $S_\omega^{-1}A$ in the following two sections. More specifically, in Section 4, using the additional restriction that $a_1 = a_2$, we develop eigenvalue estimates for a pair of factors of the iteration matrix $I-\omega M_\omega^{-1}A$ associated with the modified AD-DKR factorization. In Section 5, we explain why we believe that, for a large class of problems, these estimates suggest that the number of iterations of (1.5) required to reduce the A-norm of the error by a factor of $\varepsilon$ may be $O(h^{-2/3}\log\frac{1}{\varepsilon})$ with the associated number of arithmetic operations being $O(h^{-8/3}\log\frac{1}{\varepsilon})$, and, moreover, if (1.5) is accelerated by the Chebyshev or conjugate gradient methods, then the number of iterations may be decreased to $O(h^{-1/3}\log\frac{1}{\varepsilon})$ with the associated number of arithmetic operations being $O(h^{-7/3}\log\frac{1}{\varepsilon})$. Although these work estimates are not rigorous, numerical results presented in Section 6 strongly support our conjecture that the estimates are accurate. In addition, the numerical results indicate that the estimates are valid for the unmodified as well as the modified forms of the DKR factorization.
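
The operation counts quoted in this section all exceed the corresponding iteration counts by a factor of $h^{-2}$. One reading, stated here as an assumption rather than taken from the text, is that each iteration costs a fixed number of operations per grid point and a two-dimensional grid with stepsize $h$ has $O(h^{-2})$ points. The sketch below tabulates the exponents on that basis; the dictionary keys are labels chosen for illustration.

```python
from fractions import Fraction

# Exponents e such that the iteration count is O(h^{-e} log(1/eps)), as quoted above.
ITERATION_EXPONENT = {
    "DKR, stationary": Fraction(1),
    "DKR, accelerated": Fraction(1, 2),
    "AD-DKR (conjectured), stationary": Fraction(2, 3),
    "AD-DKR (conjectured), accelerated": Fraction(1, 3),
}

def work_exponent(iteration_exponent):
    """Work exponent, assuming O(h^{-2}) operations per iteration."""
    return iteration_exponent + 2

for name, e in ITERATION_EXPONENT.items():
    print(f"{name}: iterations ~ h^-({e}), operations ~ h^-({work_exponent(e)})")
```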

2.  The DKR Factorization.

In this section, following the notation of [5], we review the DKR factorization and present a modification.

Let $\bar\Omega_h$ be the set of points $(jh,kh) \in \bar\Omega$, where $h$ is the stepsize associated with the discretization and $j$, $k$ are integers, and let $\Omega_h$ be the set of points $(jh,kh) \in \bar\Omega_h$ such that $((j+1)h,kh)$, $((j-1)h,kh)$, $(jh,(k+1)h)$, $(jh,(k-1)h) \in \bar\Omega_h$ also. Then $\partial\Omega_h = \bar\Omega_h \setminus \Omega_h$. Let $w_{j,k}$ denote the value of the grid-function $w$ at $(jh,kh) \in \bar\Omega_h$.

For each point $(jh,kh) \in \Omega_h$, we approximate the differential operator in (1.2) by the usual five-point self-adjoint difference operator

$$(Aw)_{j,k} = b_{j,k}w_{j,k} + c_{j,k}w_{j+1,k} + f_{j,k}w_{j,k+1} + c_{j-1,k}w_{j-1,k} + f_{j,k-1}w_{j,k-1}. \tag{2.1}$$

For definiteness, we take

$$c_{j,k} = -h^{-2}a_1((j+\tfrac12)h,kh),$$
$$f_{j,k} = -h^{-2}a_2(jh,(k+\tfrac12)h),$$
$$b_{j,k} = h^{-2}\left[a_1((j+\tfrac12)h,kh) + a_1((j-\tfrac12)h,kh) + a_2(jh,(k+\tfrac12)h) + a_2(jh,(k-\tfrac12)h)\right] - q(jh,kh),$$

although our results hold for other similar sets of coefficients.
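
As a concrete illustration of (2.1) and the coefficients above, the following sketch assembles the matrix on a small grid. It assumes the unit square as the domain and uses hypothetical coefficient functions chosen only to satisfy the stated assumptions; a dense matrix is used purely for readability.

```python
import numpy as np

def assemble_five_point(N, a1, a2, q):
    """Dense matrix of (2.1) on the unit square with N x N interior grid points."""
    h = 1.0 / (N + 1)
    idx = lambda j, k: (k - 1) * N + (j - 1)      # natural ordering, j varies fastest
    A = np.zeros((N * N, N * N))
    for k in range(1, N + 1):
        for j in range(1, N + 1):
            x, y = j * h, k * h
            east, west = a1(x + h / 2, y), a1(x - h / 2, y)
            north, south = a2(x, y + h / 2), a2(x, y - h / 2)
            A[idx(j, k), idx(j, k)] = (east + west + north + south) / h**2 - q(x, y)
            # Off-diagonal entries c_{j,k} and f_{j,k}. Couplings to boundary
            # points are omitted from the matrix, since those terms belong to
            # the right side of the linear system.
            if j < N:
                A[idx(j, k), idx(j + 1, k)] = A[idx(j + 1, k), idx(j, k)] = -east / h**2
            if k < N:
                A[idx(j, k), idx(j, k + 1)] = A[idx(j, k + 1), idx(j, k)] = -north / h**2
    return A

A = assemble_five_point(
    4,
    a1=lambda x, y: 1.0 + x,        # hypothetical positive, Lipschitz coefficients
    a2=lambda x, y: 1.0 + y * y,
    q=lambda x, y: -1.0,            # q <= 0
)
print(np.allclose(A, A.T), np.linalg.eigvalsh(A).min() > 0)
```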

When the linear difference operator $A$ is written in matrix form, the terms in (2.1) that involve $w_{j,k}$ for $(jh,kh) \in \partial\Omega_h$ are incorporated into the right side of (1.1). Therefore, we adopt the convention that $w_{j,k} = 0$ if $(jh,kh) \notin \Omega_h$. For consistency of notation, we also adopt the alternative convention used in [5] that

$$c_{j,k} = 0 \quad\text{if } (jh,kh) \notin \Omega_h \text{ or } ((j+1)h,kh) \notin \Omega_h, \text{ and}$$
$$f_{j,k} = 0 \quad\text{if } (jh,kh) \notin \Omega_h \text{ or } (jh,(k+1)h) \notin \Omega_h.$$

With the latter convention, it is useful to define

$$\tilde c_{j,k} = -h^{-2}a_1((j+\tfrac12)h,kh),$$
$$\tilde f_{j,k} = -h^{-2}a_2(jh,(k+\tfrac12)h),$$
$$\tilde q_{j,k} = q(jh,kh),$$

for $(jh,kh) \in \bar\Omega_h$.

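In [5], Dupont, Kendall, and Rachford introduced the DKR factorization

$$L_1L_1^T = A+B_1 \quad\text{with}\quad B_1 = \tilde B_1 + D_1, \tag{2.2}$$

where

$$(L_1w)_{j,k} = v^{(1)}_{j,k}w_{j,k} + t^{(1)}_{j-1,k}w_{j-1,k} + g^{(1)}_{j,k-1}w_{j,k-1}, \tag{2.3}$$
$$(\tilde B_1w)_{j,k} = h^{(1)}_{j,k}w_{j-1,k+1} + h^{(1)}_{j+1,k-1}w_{j+1,k-1} - (h^{(1)}_{j,k} + h^{(1)}_{j+1,k-1})w_{j,k}, \tag{2.4}$$
$$(D_1w)_{j,k} = \alpha^{(1)}_{j,k}b_{j,k}w_{j,k}, \tag{2.5}$$

with coefficients given by

$$v^{(1)}_{j,k} = \left[b_{j,k}(1+\alpha^{(1)}_{j,k}) - h^{(1)}_{j,k} - h^{(1)}_{j+1,k-1} - (t^{(1)}_{j-1,k})^2 - (g^{(1)}_{j,k-1})^2\right]^{1/2}, \tag{2.6}$$
$$g^{(1)}_{j,k} = f_{j,k}/v^{(1)}_{j,k}, \tag{2.7}$$
$$t^{(1)}_{j,k} = c_{j,k}/v^{(1)}_{j,k}, \tag{2.8}$$
$$h^{(1)}_{j+1,k} = t^{(1)}_{j,k}g^{(1)}_{j,k} = c_{j,k}f_{j,k}/(v^{(1)}_{j,k})^2. \tag{2.9}$$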


Since $c_{j,k}$ and $f_{j,k}$ are zero for $(jh,kh) \in \partial\Omega_h$, the coefficients of the factorization can be computed recursively for $j$ and $k$ increasing. We³ modify this formulation by taking

$$h^{(1)}_{j+1,k} = \tilde c_{j,k}\tilde f_{j,k}/(v^{(1)}_{j,k})^2 \tag{2.10}$$

and initializing

$$v^{(1)}_{j,k} = \left[-\gamma^{(1)}_{j,k}(\tilde c_{j,k} + \tilde f_{j,k})\right]^{1/2} \tag{2.11}$$

if $(jh,kh) \in \partial\Omega_h$ and either $((j+1)h,kh) \in \Omega_h$ or $(jh,(k+1)h) \in \Omega_h$.⁴

³ Note that the recurrence (2.10) differs from (2.9) only at the points adjacent to $\partial\Omega_h$, where $c_{j,k}$ or $f_{j,k}$ may be zero but $\tilde c_{j,k}$ and $\tilde f_{j,k}$ are not.

Dupont, Kendall, and Rachford [5] showed that, for the unmodified factorization, the quantity under the square root on the right side of (2.6) is positive, whence $L_1L_1^T$ is symmetric and positive-definite. Lemma 4.1 proves that the modified DKR factorization possesses these properties also. However, for $\gamma^{(1)}_{j,k} < \infty$, the modification has the effect of making $\tilde B_1$ negative-definite rather than simply negative-semidefinite, as is the case for the unmodified factorization.⁵ This difference is critical to the eigenvalue estimates developed in Section 4.
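
The recurrences (2.6)-(2.9) for the unmodified factorization can be exercised on a model problem. The sketch below is an illustration under assumptions not made in the text: the unit square, $a_1 = a_2 = 1$, $q = 0$, and a constant $\alpha$; dense matrices are used only so that the identity $L_1L_1^T = A + \tilde B_1 + D_1$ can be inspected directly. The function names are hypothetical.

```python
import numpy as np

def dkr_unmodified(N, alpha=0.0):
    """Unmodified DKR coefficients (2.6)-(2.9) for the model problem on an N x N grid."""
    h = 1.0 / (N + 1)
    shape = (N + 2, N + 2)                 # indices 0..N+1; interior points are 1..N
    c = np.zeros(shape)
    f = np.zeros(shape)
    c[1:N, 1:N + 1] = -1.0 / h**2          # c_{j,k} != 0 only if (j,k), (j+1,k) interior
    f[1:N + 1, 1:N] = -1.0 / h**2          # f_{j,k} != 0 only if (j,k), (j,k+1) interior
    b = 4.0 / h**2
    v, t, g, hh = (np.zeros(shape) for _ in range(4))
    for k in range(1, N + 1):              # j and k increasing
        for j in range(1, N + 1):
            v[j, k] = np.sqrt(b * (1 + alpha) - hh[j, k] - hh[j + 1, k - 1]
                              - t[j - 1, k]**2 - g[j, k - 1]**2)   # (2.6)
            g[j, k] = f[j, k] / v[j, k]                            # (2.7)
            t[j, k] = c[j, k] / v[j, k]                            # (2.8)
            hh[j + 1, k] = t[j, k] * g[j, k]                       # (2.9)
    return b, c, f, v, t, g

def as_matrices(N, b, c, f, v, t, g):
    """Dense A from (2.1) and dense lower-triangular factor L from (2.3)."""
    idx = lambda j, k: (k - 1) * N + (j - 1)
    n = N * N
    A = np.zeros((n, n))
    L = np.zeros((n, n))
    for k in range(1, N + 1):
        for j in range(1, N + 1):
            i = idx(j, k)
            A[i, i] = b
            L[i, i] = v[j, k]
            if j > 1:
                A[i, idx(j - 1, k)] = A[idx(j - 1, k), i] = c[j - 1, k]
                L[i, idx(j - 1, k)] = t[j - 1, k]
            if k > 1:
                A[i, idx(j, k - 1)] = A[idx(j, k - 1), i] = f[j, k - 1]
                L[i, idx(j, k - 1)] = g[j, k - 1]
    return A, L

N, alpha = 5, 0.0
b, c, f, v, t, g = dkr_unmodified(N, alpha)
A, L = as_matrices(N, b, c, f, v, t, g)
B_tilde = L @ L.T - A - alpha * b * np.eye(N * N)
# For the unmodified factorization the constant vector lies in the null space of B_tilde.
print(np.allclose(B_tilde @ np.ones(N * N), 0.0))
```

In this sketch the row sums of `B_tilde` vanish, which is why the unmodified $\tilde B_1$ can be at best negative-semidefinite.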


⁴ The coefficients $\tilde c_{j,k}$ and $\tilde f_{j,k}$ used in (2.11) do not occur in the matrix $A$. Moreover, for some domains $\Omega$ and their discretizations, the computation of these coefficients may require the evaluation of $a_1$ and $a_2$, respectively, outside of $\Omega$. If this presents a problem, $\tilde c_{j,k}$ and $\tilde f_{j,k}$ may be replaced by nearby nonzero values. In most instances, this alteration does not affect the factorization significantly.

⁵ A vector $w$ with all components equal is a null-vector for the matrix $\tilde B_1$ associated with the unmodified DKR factorization.

If the grid-points are renumbered with $j$ decreasing and $k$ increasing, then an alternative form of the DKR factorization is given by

$$L_2L_2^T = A+B_2 \quad\text{with}\quad B_2 = \tilde B_2 + D_2, \tag{2.12}$$

where

$$(L_2w)_{j,k} = v^{(2)}_{j,k}w_{j,k} + t^{(2)}_{j+1,k}w_{j+1,k} + g^{(2)}_{j,k-1}w_{j,k-1}, \tag{2.13}$$
$$(\tilde B_2w)_{j,k} = h^{(2)}_{j,k}w_{j+1,k+1} + h^{(2)}_{j-1,k-1}w_{j-1,k-1} - (h^{(2)}_{j,k} + h^{(2)}_{j-1,k-1})w_{j,k}, \tag{2.14}$$
$$(D_2w)_{j,k} = \alpha^{(2)}_{j,k}b_{j,k}w_{j,k}, \tag{2.15}$$

(2) 

(D,w)  •  V  =  °-  Vb4  VW4  V» 
2  J.k  J.k  j.k  j.k 


(2.15) 


with coefficients for the unmodified factorization given by

$$v^{(2)}_{j,k} = \left[b_{j,k}(1+\alpha^{(2)}_{j,k}) - h^{(2)}_{j,k} - h^{(2)}_{j-1,k-1} - (t^{(2)}_{j+1,k})^2 - (g^{(2)}_{j,k-1})^2\right]^{1/2}, \tag{2.16}$$
$$g^{(2)}_{j,k} = f_{j,k}/v^{(2)}_{j,k}, \tag{2.17}$$
$$t^{(2)}_{j,k} = c_{j-1,k}/v^{(2)}_{j,k}, \tag{2.18}$$
$$h^{(2)}_{j-1,k} = t^{(2)}_{j,k}g^{(2)}_{j,k} = c_{j-1,k}f_{j,k}/(v^{(2)}_{j,k})^2. \tag{2.19}$$

For the modified factorization, we replace (2.19) by

$$h^{(2)}_{j-1,k} = \tilde c_{j-1,k}\tilde f_{j,k}/(v^{(2)}_{j,k})^2 \tag{2.20}$$

and initialize

$$v^{(2)}_{j,k} = \left[-\gamma^{(2)}_{j,k}(\tilde c_{j-1,k} + \tilde f_{j,k})\right]^{1/2} \tag{2.21}$$

if $(jh,kh) \in \partial\Omega_h$ and either $((j-1)h,kh) \in \Omega_h$ or $(jh,(k+1)h) \in \Omega_h$. For either the modified or unmodified factorizations, the coefficients can be computed recursively with $j$ decreasing and $k$ increasing.

Again, $L_2L_2^T$ is symmetric and positive-definite for both the modified and unmodified factorizations. Furthermore, for $\gamma^{(2)}_{j,k} < \infty$, the modification has the effect of making $\tilde B_2$ negative-definite rather than simply negative-semidefinite, as is the case for the unmodified factorization.

We end this section with a remark about the directional dependence of the factorizations (2.2) and (2.12). Not only are the coefficients computed in a different order, but, also, $\tilde B_1$ resembles a second-order difference operator with differences taken along lines $x+y=c$, while $\tilde B_2$ resembles a similar operator with differences taken along lines $x-y=c$.

3.  Error  Estimates. 

In  this  section,  we  review  some  general  results  concerning  the  rate 
of  convergence  of  the  stationary  iteration  (1.5)  and  its  Chebyshev  or 
conjugate  gradient  acceleration. 

To begin, we introduce some additional notation. If $x = (x_1,\ldots,x_n)$ and $y = (y_1,\ldots,y_n)$ are two n-vectors, let the inner-product of $x$ and $y$ be $(x,y) = x_1\bar y_1 + \cdots + x_n\bar y_n$, where $\bar y_i$ is the complex conjugate of $y_i$. Let the norm of $x$ be $\|x\| = (x,x)^{1/2}$ and, for any n by n matrix $C$, let the norm of $C$ be $\|C\| = \max\{ \|Cx\| : \|x\| = 1 \}$. For any symmetric positive-definite matrix $P$, let the P-norm of $x$ be $\|x\|_P = (Px,x)^{1/2}$ and the P-norm of $C$ be $\|C\|_P = \max\{ \|Cx\|_P : \|x\|_P = 1 \}$. Also, for any matrix $C$, let the spectral radius of $C$ be $\rho(C) = \max\{ |\lambda| : \lambda \text{ an eigenvalue of } C \}$.
If $w$ is the solution of (1.1), $w_n$ is the $n$th iterate generated by the stationary iteration (1.5), and $e_n = w - w_n$ is the error in the $n$th iterate, then

$$e_n = [I-\omega\tilde A^{-1}A]e_{n-1} = [I-\omega\tilde A^{-1}A]^ne_0, \tag{3.1}$$

where $e_0$ is the error in the initial guess $w_0$. Since $A$ is symmetric and positive-definite, it is valid to multiply (3.1) by $A^{1/2}$ to get

$$A^{1/2}e_n = [I-\omega A^{1/2}\tilde A^{-1}A^{1/2}]A^{1/2}e_{n-1} = [I-\omega A^{1/2}\tilde A^{-1}A^{1/2}]^nA^{1/2}e_0,$$

whence

$$\|e_n\|_A \le \|[I-\omega A^{1/2}\tilde A^{-1}A^{1/2}]^n\|\,\|e_0\|_A. \tag{3.2}$$

The last inequality is the basis for the following lemma.

Lemma 3.1: If $\rho = \rho(I-\omega A^{1/2}\tilde A^{-1}A^{1/2}) < 1$, then the number of iterations of (1.5) required to reduce the A-norm of the initial error by a factor of $\varepsilon$ is at most $n+1$, where

$$(n-q)\log\frac{1}{\rho} - \log\binom{n}{q} = \log\frac{1}{\varepsilon} + \log c, \tag{3.3}$$

$c$ is a positive constant,⁶ and $q+1$ is the size of the largest Jordan block of $I-\omega A^{1/2}\tilde A^{-1}A^{1/2}$ with an eigenvalue of magnitude $\rho$. If $A^{1/2}\tilde A^{-1}A^{1/2}$ is normal, then

$$n = \log\frac{1}{\varepsilon} \Big/ \log\frac{1}{\rho}. \tag{3.4}$$

⁶ The constant $c$ is dependent upon the similarity transformation that reduces $A^{1/2}\tilde A^{-1}A^{1/2}$ to Jordan normal form (see Theorem 3.1 of [14]).

Proof: By Theorem 3.1 of [14],

$$\|[I-\omega A^{1/2}\tilde A^{-1}A^{1/2}]^n\| \le c\binom{n}{q}\rho^{n-q}$$

for constants $c$, $q$ and $\rho$ specified above. This inequality, together with (3.2), proves the validity of (3.3). If $A^{1/2}\tilde A^{-1}A^{1/2}$ is normal, then $I-\omega A^{1/2}\tilde A^{-1}A^{1/2}$ can be diagonalized by a unitary similarity transformation, whence

$$\|[I-\omega A^{1/2}\tilde A^{-1}A^{1/2}]^n\| = \rho^n$$

and the well-known result (3.4) follows. Q.E.D.
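
For the normal case, the estimate (3.4) is easy to evaluate. The following small sketch computes it; the function name and the sample values of $\rho$ and $\varepsilon$ are illustrative assumptions only.

```python
import math

def stationary_iterations(rho, eps):
    """n from (3.4): n = log(1/eps) / log(1/rho), valid when the iteration matrix is normal."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    return math.log(1.0 / eps) / math.log(1.0 / rho)

n = stationary_iterations(rho=0.9, eps=1e-6)
# Lemma 3.1 bounds the iteration count by n + 1; round n up to obtain an integer bound.
print(math.ceil(n) + 1)
```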


If (1.5) is accelerated by the Chebyshev technique, then the error at the $n$th step satisfies

$$e_n = P_n(\tilde A^{-1}A)e_0, \tag{3.5}$$

where $P_n(z)$ is the translated and normalized Chebyshev polynomial of degree $n$. (See, for example, [1].) Multiplying (3.5) by $A^{1/2}$ and taking norms, we get that

$$\|e_n\|_A \le \|P_n(A^{1/2}\tilde A^{-1}A^{1/2})\|\,\|e_0\|_A. \tag{3.6}$$

The last inequality is the basis of the following lemma.

Lemma 3.2: If the eigenvalues of $A^{1/2}\tilde A^{-1}A^{1/2}$ lie in the ellipse

$$E = \{ z \in C : z = 1 - a\cos\theta + i\,b\sin\theta,\ 0 \le \theta \le 2\pi \}, \tag{3.7}$$

where $0 \le b < a < 1$, then the number of iterations of the Chebyshev acceleration of (1.5) required to reduce the A-norm of the initial error by a factor of $\varepsilon$ is at most $n+1$, where

$$n\log\frac{1}{r} - q\log n = \log\frac{1}{\varepsilon} + \log c, \tag{3.8}$$

$c$ is a positive constant,⁷ $q+1$ is the size of the largest Jordan block of $\tilde A^{-1}A$ with an eigenvalue on the ellipse $E$, and

$$r = (a+b)\big/\left(1 + \sqrt{1 - a^2 + b^2}\right). \tag{3.9}$$

If $\tilde A$ is symmetric, then

$$n = \log\frac{2}{\varepsilon} \Big/ \log\frac{1}{r} \tag{3.10}$$

and, moreover, $b = 0$ in the expression for $r$.

⁷ The constant $c$ is dependent upon both the similarity transformation that reduces $A^{1/2}\tilde A^{-1}A^{1/2}$ to Jordan normal form (see Theorem 3.1 of [14]) and the bound (2.22) of [8] on the Chebyshev polynomials.

Proof: By inequality (2.22) of [8], Section 6.2 of [1], and an argument similar to the one leading to Theorem 3.1 of [14],

$$\|P_n(A^{1/2}\tilde A^{-1}A^{1/2})\| \le c\,n^qr^n \tag{3.11}$$

for the constants $c$, $q$, and $r$ specified above. This inequality, together with (3.6), proves the validity of (3.8). If $\tilde A$ is symmetric, then $A^{1/2}\tilde A^{-1}A^{1/2}$ has real eigenvalues and, moreover, it can be diagonalized by an orthogonal similarity transformation. Hence, it follows from a simplification of the argument used to prove (3.11) that

$$\|P_n(A^{1/2}\tilde A^{-1}A^{1/2})\| \le 2r^n,$$

where $b = 0$ in the expression for $r$. This together with inequality (3.6) proves the validity of (3.10). Q.E.D.


Although variants of the conjugate gradient algorithm have been developed for nonsymmetric problems, the analysis of these methods is not well-developed. Consequently, we limit our discussion of the conjugate gradient acceleration of (1.5) to the case that $\tilde A$ is symmetric and positive-definite. In this case, it is well-known that the approximate solution $w_n$ generated by the conjugate gradient acceleration of (1.5) minimizes the A-norm of the associated error, $e_n$, over all possible errors of the form

$$p_n(\tilde A^{-1}A)e_0,$$

where $p_n(z)$ is a polynomial of degree $n$ satisfying $p_n(0) = 1$. (See, for example, [1].) Since the translated and normalized Chebyshev polynomial, $P_n(z)$, satisfies these conditions, the following lemma is an immediate consequence of Lemma 3.2.

Lemma 3.3: If $\tilde A$ is symmetric and the eigenvalues of $A^{1/2}\tilde A^{-1}A^{1/2}$ lie in the interval $[1-a,1+a]$, $0 \le a < 1$, then the number of iterations of the conjugate gradient acceleration of (1.5) required to reduce the A-norm of the initial error by a factor of $\varepsilon$ is at most $n+1$, where

$$n = \log\frac{2}{\varepsilon} \Big/ \log\frac{1}{r} \tag{3.12}$$

and

$$r = a\big/\left(1 + \sqrt{1 - a^2}\right).$$
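
The effect of acceleration can be seen by evaluating (3.9) together with (3.10) or (3.12) for a sample spectrum. This is a sketch with illustrative names and sample values; the comparison with the unaccelerated count assumes $\omega = 1$, so that the spectral radius of the stationary iteration matrix is at most $a$ when the eigenvalues lie in $[1-a, 1+a]$.

```python
import math

def convergence_factor(a, b=0.0):
    """r from (3.9); b = 0 gives the symmetric case used in Lemma 3.3."""
    return (a + b) / (1.0 + math.sqrt(1.0 - a * a + b * b))

def accelerated_iterations(a, eps, b=0.0):
    """n from (3.10)/(3.12): n = log(2/eps) / log(1/r)."""
    r = convergence_factor(a, b)
    return math.log(2.0 / eps) / math.log(1.0 / r)

a, eps = 0.99, 1e-6
n_accelerated = accelerated_iterations(a, eps)
n_stationary = math.log(1.0 / eps) / math.log(1.0 / a)   # (3.4) with rho = a
print(n_accelerated < n_stationary)
```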


To use the results developed in this section to bound the number of iterations of (1.5) or its acceleration, we require estimates of the spectrum of $\tilde A^{-1}A$. We turn to this question next.

4.  Eigenvalue Estimates.

For the AD-DKR factorization, the iteration matrix associated with (1.5) is

$$I-\omega M_\omega^{-1}A = [A+B_1]^{-1}[(1-\omega)A+B_1][A+B_2]^{-1}[(1-\omega)A+B_2],$$

which is similar to

$$[(1-\omega)A+B_1][A+B_2]^{-1}[(1-\omega)A+B_2][A+B_1]^{-1}.$$

In this section, we develop some eigenvalue estimates for the pair of factors $[(1-\omega)A+B_1][A+B_2]^{-1}$ and $[(1-\omega)A+B_2][A+B_1]^{-1}$. These estimates provide some guidance (which has proven to be very effective in practice) for choosing the parameters $\{\alpha^{(i)}_{j,k}\}$ and $\omega$ required by the AD-DKR and SAD-DKR factorizations. Moreover, these estimates are the basis for the conjectures developed in the next section concerning the work required to solve (1.1) to a specified tolerance.

A number of preliminary lemmas are required before we can state and prove the main result of this section.


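Lemma 4.1: For either form of the modified DKR factorization (2.2) or (2.12), if $\alpha^{(i)}_{j,k} \ge 0$ and $\rho_i \le \gamma^{(i)}_{j,k} < \infty$, $i = 1,2$, where

$$\rho_i = \min\left\{ \tfrac12\left[(1+\alpha^{(i)}_{j,k})(1+\mu^{(i)}_{j,k}) + \left\{\left((1+\alpha^{(i)}_{j,k})(1+\mu^{(i)}_{j,k})\right)^2 - 4\mu^{(i)}_{j,k}\right\}^{1/2}\right] \right\},$$

$$\mu^{(1)}_{j,k} = \frac{\tilde c_{j-1,k}+\tilde f_{j,k-1}}{\tilde c_{j,k}+\tilde f_{j,k}}, \qquad \mu^{(2)}_{j,k} = \frac{\tilde c_{j,k}+\tilde f_{j,k-1}}{\tilde c_{j-1,k}+\tilde f_{j,k}},$$

then

$$(v^{(1)}_{j,k})^2 \ge -\rho_1(\tilde c_{j,k} + \tilde f_{j,k}), \tag{4.1}$$
$$(v^{(2)}_{j,k})^2 \ge -\rho_2(\tilde c_{j-1,k} + \tilde f_{j,k}), \tag{4.2}$$
$$0 \le h^{(1)}_{j+1,k} \le -\frac{1}{\rho_1}\,\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}}, \tag{4.3}$$
$$0 \le h^{(2)}_{j-1,k} \le -\frac{1}{\rho_2}\,\frac{\tilde c_{j-1,k}\tilde f_{j,k}}{\tilde c_{j-1,k}+\tilde f_{j,k}}, \tag{4.4}$$

at all points $(jh,kh) \in \bar\Omega_h$ at which $v^{(i)}_{j,k}$ and $h^{(i)}_{j,k}$ are required. Moreover, if the user-selected parameters $\alpha^{(i)}_{j,k}$ and $\gamma^{(i)}_{j,k}$ are uniformly bounded above independently of the stepsize, $h$, then

$$0 < Hh^{-2} \le h^{(i)}_{j,k}, \tag{4.5}$$

where $H$, although problem dependent, is independent of the stepsize, $h$.⁸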


Proof: We prove this lemma for the factorization (2.2) only, as the proof for (2.12) is similar.

Since the initial values of $v^{(1)}_{j,k}$ and $h^{(1)}_{j,k}$ for the modified DKR factorization satisfy (4.1) and (4.3) and the basic recurrence relations used to calculate the coefficients for both the original and modified DKR factorizations are essentially the same, the induction argument used in Lemma 1 of [5] also proves (4.1) and (4.3).

⁸ Inequality (4.5) does not hold near $\partial\Omega_h$ for the unmodified DKR factorization.

To prove (4.5), note that, if $v^{(1)}_{j,k}$ is computed by the recurrence (2.6), then


(i), 


(1) 


(4.6) 


where 


M 


j.k 


Cj»k+?j.k 


whence,  by  (2.10),  (2.11),  end  (4.6), 


h 


(1) 

j+l.k 


2. 


M  M 

*j.k+fj.k 


2. 


iin(c 


J.k* 


2. 


where  «  k  ie  either  U^Hl+Pj;^^)  or  yjk  depending  upon  whether 

v*1*  it  cowpwtef.  i»S*  (2.6)  or  (2.11),  respectively.  The  proof  is 

(i)  (i) 

completed  by  observing  that  the  sssnaptions  on  «^k»  TJ>k.  »2  *nd  q 

ensure  that  is  bounded  above  independently  of  the  atepsize,  h.  Q.B.D. 

J«k 


Lemma 4.2: For either form of the modified DKR factorization (2.2) or (2.12), if $\alpha^{(i)}_{j,k} \ge 0$, then $\rho_i \ge 1$. Moreover, if $\alpha^{(i)}_{j,k} = c_0h^p$ for constants $c_0 > 0$ and $0 < p \le 2$, then $\rho_i \ge 1+c_1h^{p/2}$ for some constant $c_1 > 0$.

Proof: The bound $\rho_i \ge 1$ follows directly from the definition of $\rho_i$ in Lemma 4.1. If, in addition, $\alpha^{(i)}_{j,k} = c_0h^p$ and $a_1, a_2 \ge \eta > 0$ are Lipschitz continuous in $\bar\Omega$, then $\rho_1 \ge 1+c_1h^{p/2}$ by an argument similar to the one used to prove (4.15) in [5]. The corresponding inequalities for $\rho_2$ are proved in a similar way. Q.E.D.


Lemma 4.3: For $A$, $\tilde B_1$, and $\tilde B_2$ defined by (2.1), (2.4), and (2.14), respectively,

$$(Ax,x) = -\sum\left\{\tilde c_{j,k}|x_{j+1,k}-x_{j,k}|^2 + \tilde f_{j,k}|x_{j,k+1}-x_{j,k}|^2 + \tilde q_{j,k}|x_{j,k}|^2\right\}, \tag{4.7}$$
$$(\tilde B_1x,x) = -\sum h^{(1)}_{j+1,k}|x_{j+1,k}-x_{j,k+1}|^2, \tag{4.8}$$
$$(\tilde B_2x,x) = -\sum h^{(2)}_{j,k}|x_{j+1,k+1}-x_{j,k}|^2, \tag{4.9}$$

where we have used the convention that $x_{j,k} = 0$ for $(jh,kh) \notin \Omega_h$ and the sums are taken over all nonzero terms.

Proof: The validity of equations (4.7)-(4.9) can be demonstrated easily by summation by parts, as is the validity of the similar set of equations (4.7)-(4.8) in [5]. Q.E.D.
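
The identity (4.7) can be confirmed numerically on a model problem. The sketch below assumes the unit square, $a_1 = a_2 = 1$, and a hypothetical constant $q \le 0$, with real grid values so that $|\cdot|^2$ is an ordinary square; none of these choices come from the text.

```python
import numpy as np

N = 6
h = 1.0 / (N + 1)
q = -2.0                                   # hypothetical constant q <= 0
rng = np.random.default_rng(2)

X = np.zeros((N + 2, N + 2))               # grid values, zero on the boundary ring
X[1:-1, 1:-1] = rng.standard_normal((N, N))
interior = X[1:-1, 1:-1]

# (Ax, x) evaluated directly from the five-point stencil (2.1).
AX = (4.0 / h**2 - q) * interior - (X[2:, 1:-1] + X[:-2, 1:-1]
                                    + X[1:-1, 2:] + X[1:-1, :-2]) / h**2
lhs = np.sum(AX * interior)

# Right side of (4.7) with c~ = f~ = -1/h^2 and q~ = q; the sums run over all
# links, including those that join an interior point to a boundary point.
c_t = f_t = -1.0 / h**2
rhs = -(c_t * np.sum((X[1:, :] - X[:-1, :])**2)
        + f_t * np.sum((X[:, 1:] - X[:, :-1])**2)
        + q * np.sum(X**2))
print(np.isclose(lhs, rhs))
```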

Lemma 4.4: For either form of the modified DKR factorization (2.2) or (2.12), if $\alpha^{(i)}_{j,k} \ge 0$ and $\rho_i \le \gamma^{(i)}_{j,k} < \infty$, then

$$0 \le -(\tilde B_ix,x) \le \frac{1}{\rho_i}(Ax,x). \tag{4.10}$$

If, in addition, the user-selected parameters $\{\alpha^{(i)}_{j,k}\}$ and $\{\gamma^{(i)}_{j,k}\}$ are uniformly bounded above independently of the stepsize, $h$, then

$$c_2(x,x) \le -(\tilde B_ix,x) \tag{4.11}$$

for some constant $c_2 > 0$.⁹



Proof: We prove this lemma for the factorization (2.2) only, as the proof for (2.12) is similar.

To prove (4.10), we use an argument similar to the one used to prove (4.11) in [5]. First, observe that $0 \le -(\tilde B_1x,x)$ follows directly from (4.8) of Lemma 4.3, since $h^{(1)}_{j+1,k} \ge 0$ by Lemma 4.1. To verify the upper bound on $-(\tilde B_1x,x)$, note that, by Lemma 3 of [5],

$$\frac{cf}{c+f}|a-b|^2 \le c|a-e|^2 + f|b-e|^2,$$

for any positive $c$, $f$ and any complex $a$, $b$, $e$. This inequality, together with Lemmas 4.1 and 4.3, shows that

$$-(\tilde B_1x,x) = \sum h^{(1)}_{j+1,k}|x_{j+1,k}-x_{j,k+1}|^2$$
$$\le -\frac{1}{\rho_1}\sum\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}}|x_{j+1,k}-x_{j,k+1}|^2$$
$$\le -\frac{1}{\rho_1}\sum\left\{\tilde c_{j,k}|x_{j+1,k}-x_{j,k}|^2 + \tilde f_{j,k}|x_{j,k+1}-x_{j,k}|^2\right\}$$
$$\le \frac{1}{\rho_1}(Ax,x).$$

⁹ For the unmodified DKR factorization, inequality (4.11) does not hold, whence $\tilde B_i$ is negative-semidefinite rather than negative-definite.

To prove (4.11), observe that, by Lemmas 4.1 and 4.3,

$$-(\tilde B_1x,x) = \sum h^{(1)}_{j+1,k}|x_{j+1,k}-x_{j,k+1}|^2 \ge Hh^{-2}\sum|x_{j+1,k}-x_{j,k+1}|^2 = Hh^{-2}\sum_{\text{all } L}\sum_{\ell}|y_L^{(\ell+1)}-y_L^{(\ell)}|^2,$$

where each $L$ is a diagonal in $\bar\Omega_h$ satisfying $x+y=c$, for some constant $c$, and $\{y_L^{(\ell)}\}$ is the subset of $\{x_{j,k}\}$ on $L$. For each $L$, let $y_L$ be the $n_L$-vector with components $\{y_L^{(\ell)}\}$ on the diagonal $L$, and let $C_L$ be the $n_L$ by $n_L$ matrix $h^{-2}\,\mathrm{tridiag}(-1,2,-1)$. Then

$$h^{-2}\sum_{\ell}|y_L^{(\ell+1)}-y_L^{(\ell)}|^2 = (C_Ly_L,y_L) \ge \lambda_L(y_L,y_L),$$

where $\lambda_L$ is the minimum eigenvalue of $C_L$. Since the length of any diagonal $L$ in $\bar\Omega_h$ is bounded, there exists a constant $\lambda_* > 0$, independent of both $h$ and $L$, such that $\lambda_L \ge \lambda_* > 0$. Consequently, (4.11) holds for $c_2 = H\lambda_* > 0$.

Q.E.D.
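
The key step in the argument for (4.11) is that the smallest eigenvalue of $h^{-2}\,\mathrm{tridiag}(-1,2,-1)$ stays bounded away from zero as $h \to 0$ when the number of points on a diagonal grows like $1/h$. The sketch below illustrates this for the longest diagonal of a unit-square grid, which is an assumed geometry; the function name is hypothetical.

```python
import numpy as np

def min_eig_CL(m, h):
    """Smallest eigenvalue of the m x m matrix h^{-2} tridiag(-1, 2, -1)."""
    C = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    return np.linalg.eigvalsh(C)[0]

for N in (7, 15, 31):
    h = 1.0 / (N + 1)
    # A diagonal of the N x N interior grid carries at most N points;
    # shorter diagonals give larger smallest eigenvalues.
    print(N, min_eig_CL(N, h))
```

The printed values approach $\pi^2$ from below as $N$ grows, consistent with the closed form $4h^{-2}\sin^2(\pi h/2)$ for this matrix.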


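Lemma 4.5: If

1. $\alpha^{(i)}_{j,k} = c_0h^p$ for constants $c_0 > 0$ and $0 < p < 2$, and
2. $a_1 = a_2$,

then

$$0 \le -(\tilde B_1x,x) - (\tilde B_2x,x) \le (1-c_3h^{p/2})(Ax,x)$$

for some constant $c_3 > 0$.

Proof: By Lemmas 4.1 and 4.3,

$$0 \le -(\tilde B_1x,x) - (\tilde B_2x,x)$$
$$\le -\frac{1}{\rho_*}\sum\left\{\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}}|x_{j+1,k}-x_{j,k+1}|^2 + \frac{\tilde c_{j,k}\tilde f_{j+1,k}}{\tilde c_{j,k}+\tilde f_{j+1,k}}|x_{j+1,k+1}-x_{j,k}|^2\right\},$$

where $\rho_* = \min\{\rho_1,\rho_2\}$. For some constant $L$,

$$-\frac{\tilde c_{j,k}\tilde f_{j+1,k}}{\tilde c_{j,k}+\tilde f_{j+1,k}} \le -(1+Lh)\,\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}},$$

since $a_2 \ge \eta > 0$ is Lipschitz continuous in $\bar\Omega$. Also, for any complex values $a$, $b$, $c$, $d$,

$$|a-b|^2 + |c-d|^2 \le |a-c|^2 + |a-d|^2 + |b-c|^2 + |b-d|^2.$$

Therefore,

$$0 \le -(\tilde B_1x,x) - (\tilde B_2x,x)$$
$$\le -\frac{1+Lh}{\rho_*}\sum\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}}\left(|x_{j+1,k}-x_{j,k+1}|^2 + |x_{j+1,k+1}-x_{j,k}|^2\right)$$
$$\le -\frac{1+Lh}{\rho_*}\sum\frac{\tilde c_{j,k}\tilde f_{j,k}}{\tilde c_{j,k}+\tilde f_{j,k}}\left(|x_{j+1,k}-x_{j,k}|^2 + |x_{j,k+1}-x_{j,k}|^2 + |x_{j+1,k+1}-x_{j,k+1}|^2 + |x_{j+1,k+1}-x_{j+1,k}|^2\right)$$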
then the AD-DKR iteration matrix $M_\omega$ is well-defined.

Proof: From Lemma 4.5, if $0 < \omega \le 1$, then $[(2-\omega)A+\tilde B_1+\tilde B_2]$ is positive-definite, whence so is $[(2-\omega)A+B_1+B_2]$. Therefore, $[(2-\omega)A+B_1+B_2]$ is nonsingular and $M_\omega$ is well-defined by (1.6). Q.E.D.



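Theorem 4.7: Assume that

1. $\alpha^{(i)}_{j,k} = c_0h^p$ for constants $c_0 > 0$ and $0 < p < 2$,
2. $\rho_i \le \gamma^{(i)}_{j,k} \le \gamma_*$ for some constant $\gamma_* < \infty$ independent of the stepsize, $h$, and
3. $a_1 = a_2$.

Then any eigenvalue $\lambda$ of either $[(1-\omega)A+B_1][A+B_2]^{-1}$ or $[(1-\omega)A+B_2][A+B_1]^{-1}$ is real and satisfies

$$-1 + c_4h^{p/2} - (\omega-1)c_6h^{-p} \le \lambda \le 1 - c_5h^{2-p}, \tag{4.12}$$

if $\omega \ge 1$, and

$$-1 + c_4h^{p/2} \le \lambda \le 1 - c_5h^{2-p} + (1-\omega)c_6h^{-p}, \tag{4.13}$$

if $\omega \le 1$, where $c_4$, $c_5$, $c_6$ are positive constants.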


Proof: We prove this result for $[(1-\omega)A+B_1][A+B_2]^{-1}$ only, as the proof for $[(1-\omega)A+B_2][A+B_1]^{-1}$ is similar.

Since $A$, $B_1$, and $B_2$ are symmetric and $L_2L_2^T = A+B_2$ is positive-definite, $[(1-\omega)A+B_1][A+B_2]^{-1}$ is similar to the symmetric matrix $[A+B_2]^{-1/2}[(1-\omega)A+B_1][A+B_2]^{-1/2}$. Consequently, the eigenvalues of $[(1-\omega)A+B_1][A+B_2]^{-1}$ are real. Moreover, for $x = [A+B_2]^{-1/2}y$, $y \ne 0$,

$$\frac{([A+B_2]^{-1/2}[(1-\omega)A+B_1][A+B_2]^{-1/2}y,y)}{(y,y)} = \frac{([(1-\omega)A+B_1]x,x)}{([A+B_2]x,x)},$$

whence any eigenvalue $\lambda$ of $[(1-\omega)A+B_1][A+B_2]^{-1}$ satisfies

$$\min_{x\ne0}\frac{([(1-\omega)A+B_1]x,x)}{([A+B_2]x,x)} \le \lambda \le \max_{x\ne0}\frac{([(1-\omega)A+B_1]x,x)}{([A+B_2]x,x)}.$$

In addition, since $B_1 = \tilde B_1+D_1$, $B_2 = \tilde B_2+D_2$, and, by Assumption 1, $D_1 = D_2 = D$,

$$\frac{([(1-\omega)A+B_1]x,x)}{([A+B_2]x,x)} = \frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)}. \tag{4.14}$$

Thus, to verify that inequalities (4.12) and (4.13) hold, it is sufficient to develop upper and lower bounds for the right side of (4.14), where, throughout this proof, we assume $x \ne 0$.


Since $A+B_2 = L_2L_2^T$ is positive-definite, $(Ax,x)+(\tilde B_2x,x)+(Dx,x) > 0$. Therefore, if $(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x) \le 0$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \le 0 \le 1 - c_5h^{2-p}$$

for $c_5$ sufficiently small, as $h$ is bounded above in any discretization of $\Omega$. On the other hand, if $(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x) \ge 0$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \le 1 + \frac{(\tilde B_1x,x)}{(Dx,x)} + (1-\omega)\frac{(Ax,x)}{(Dx,x)}, \tag{4.15}$$

since, by Lemmas 4.2 and 4.4, $(Ax,x)+(\tilde B_2x,x) > 0$. By Assumptions 1-2 and the assumptions on $a_1$, $a_2$, and $q$, there exist positive constants $m$ and $M$ such that

$$mh^{-2+p}(x,x) \le (Dx,x) \le Mh^{-2+p}(x,x), \tag{4.16}$$

whence, by Lemma 4.4,

$$\frac{(\tilde B_1x,x)}{(Dx,x)} \le -c_5h^{2-p}$$

for $c_5 \le c_2/M$. Furthermore, from Lemma 4.3 and the definition of $D$,

$$\frac{(Ax,x)}{(Dx,x)} \le c_6h^{-p}$$

for $c_6 \ge 2/c_0$. Hence, if $\omega \ge 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \le 1 - c_5h^{2-p},$$

and, if $\omega \le 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \le 1 - c_5h^{2-p} + (1-\omega)c_6h^{-p},$$

showing that the upper bounds for inequalities (4.12) and (4.13) are valid.


To verify the lower bounds, consider two cases depending upon whether $(Dx,x) > (Ax,x)$. If $(Dx,x) > (Ax,x)$, then

$$(\tilde B_ix,x) + (Dx,x) > 0, \quad i = 1,2,$$

by Lemmas 4.2 and 4.4, whence

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge \frac{(1-\omega)(Ax,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)}.$$

Therefore, if $\omega \le 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge 0,$$

and, if $\omega \ge 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge -(\omega-1).$$

Thus, the lower bounds in the theorem are satisfied in this case provided that $c_6$ is sufficiently large, since $h$ is bounded above in any discretization of $\Omega$.

On the other hand, if $(Dx,x) \le (Ax,x)$, then

$$\frac{(Ax,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge \frac12.$$

Furthermore, by Lemma 4.5,

$$(\tilde B_1x,x) \ge -(1-c_3h^{p/2})(Ax,x) - (\tilde B_2x,x),$$

whence

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge \frac{-(Ax,x)-(\tilde B_2x,x)+(Dx,x)+(1-\omega+c_3h^{p/2})(Ax,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)}$$
$$\ge -1 + (1-\omega+c_3h^{p/2})\frac{(Ax,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)}. \tag{4.17}$$

Consequently, by (4.17), if $\omega \le 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge -1 + c_4h^{p/2}$$

for $c_4 \le c_3/2$, and, if $\omega \ge 1$, then

$$\frac{(1-\omega)(Ax,x)+(\tilde B_1x,x)+(Dx,x)}{(Ax,x)+(\tilde B_2x,x)+(Dx,x)} \ge -1 + c_4h^{p/2} - (\omega-1)c_6h^{-p},$$

showing that the lower bounds for inequalities (4.12) and (4.13) are valid.

Q.E.D.

If our objective is to minimize $\rho([(1-\omega)A+B_1][A+B_2]^{-1})$ and
$\rho([(1-\omega)A+B_2][A+B_1]^{-1})$ in the hope that this will minimize
$\rho(I-\omega M_1^{-1}A)$ and lead to an effective stationary iteration,
then, based upon equations (4.12) and (4.13), we should take $\omega = 1$
and $p = \frac{4}{3}$. For future reference, we restate Theorem 4.7 for
these particular values of $\omega$ and $p$.
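
The choice $p = \frac{4}{3}$ can be read off from the exponents that appear in the proof above. The short calculation below is our own summary, not part of the original argument; it assumes that, for $\omega = 1$, the bounds (4.12) and (4.13) carry the exponents $2-p$ (upper bound) and $p/2$ (lower bound) seen in the proof, and that $h \le 1$.

```latex
\[
  -1 + c_4 h^{p/2} \;\le\; \lambda \;\le\; 1 - c_5 h^{2-p}
  \quad\Longrightarrow\quad
  |\lambda| \;\le\; 1 - \min\{c_4, c_5\}\, h^{\max\{p/2,\; 2-p\}},
\]
\[
  \max\{p/2,\; 2-p\} \ \text{is smallest when}\ \frac{p}{2} = 2 - p,
  \quad\text{i.e.}\quad p = \frac{4}{3},
  \qquad \frac{p}{2} = 2 - p = \frac{2}{3}.
\]
```

Any other value of $p$ makes one of the two exponents larger than $\frac{2}{3}$ and therefore weakens the resulting bound on the spectral radius as $h \to 0$.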


Theorem 4.8: Assume that

1. $\omega = 1$,

2. $\alpha_{j,k} = c_0h^{4/3}$ for some positive constant $c_0$,

3. the bound of Assumption 3 of Theorem 4.7 holds with some constant
$\gamma_0 < \infty$ independent of the stepsize $h$, and

4. $a_1 = a_2$.


Then any eigenvalue $\lambda$ of either $B_1[A+B_2]^{-1}$ or
$B_2[A+B_1]^{-1}$ satisfies

$-1 + c_4h^{2/3} \le \lambda \le 1 - c_5h^{2/3}$,

whence

$\rho(B_1[A+B_2]^{-1}) \le 1 - c_7h^{2/3}$

and

$\rho(B_2[A+B_1]^{-1}) \le 1 - c_7h^{2/3}$,

where $c_7 = \min\{c_4,c_5\} > 0$.

5. Work Estimates: Conjectures and Discussion.

In this section, using the eigenvalue estimates from the previous
section, we develop conjectures that both $\rho(I-M_1^{-1}A)$ and
$\rho(I-S_1^{-1}A)$ are bounded by $1-ch^{2/3}$, for some positive constant
$c$. If the conjecture for the SAD-DKR factorization is valid, then the
number of iterations of (1.5) required to reduce the A-norm of the initial
error by a factor of $\varepsilon$ is $O(h^{-2/3}\log\frac{1}{\varepsilon})$,
with the associated number of arithmetic operations being
$O(h^{-8/3}\log\frac{1}{\varepsilon})$. Moreover, if (1.5) is accelerated
by the Chebyshev or conjugate gradient techniques, then the number of
iterations is decreased to $O(h^{-1/3}\log\frac{1}{\varepsilon})$, with the
associated number of arithmetic operations being
$O(h^{-7/3}\log\frac{1}{\varepsilon})$. If additional conjectures concerning
the spectral structure of $M_1^{-1}A$ hold, then, for the AD-DKR
factorization, similar work estimates are valid for the stationary
iteration (1.5) and its Chebyshev acceleration. Although the work estimates
in this section are not rigorous, numerical results presented in the next
section strongly support our conjecture that they are accurate.

We begin by stating the two fundamental conjectures about
$\rho(I-M_1^{-1}A)$ and $\rho(I-S_1^{-1}A)$ upon which the work estimates in
this section are based.

Conjecture 5.1: If the assumptions of Theorem 4.8 hold, then
$\rho(I-M_1^{-1}A) \le 1-c_7h^{2/3}$. Moreover, the eigenvalues of
$M_1^{-1}A$ lie in a very eccentric ellipse, the major axis of which is
contained in the interval $[c_7h^{2/3},\,2-c_7h^{2/3}]$.


Discussion: If $C_1$ and $C_2$ are normal matrices, then

$\rho(C_1C_2) \le \|C_1C_2\| \le \|C_1\|\cdot\|C_2\| = \rho(C_1)\rho(C_2)$.

Hence, if $B_1[A+B_2]^{-1}$ and $B_2[A+B_1]^{-1}$ were normal, then

$\rho(I-M_1^{-1}A) = \rho([A+B_1]^{-1}B_1[A+B_2]^{-1}B_2)$

$\qquad = \rho(B_1[A+B_2]^{-1}B_2[A+B_1]^{-1})$

$\qquad \le \rho(B_1[A+B_2]^{-1})\,\rho(B_2[A+B_1]^{-1})$,

and the first statement of the conjecture would follow from Theorem 4.8.
Moreover, if $M_1$ were symmetric, then the eigenvalues of $M_1^{-1}A$ would
be real and would lie in the interval $[c_7h^{2/3},\,2-c_7h^{2/3}]$.
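
The role of normality in the product inequality can be checked on 2-by-2 examples. The sketch below is ours and not part of the paper; the helper names `matmul_2x2` and `spectral_radius_2x2` and the sample matrices are hypothetical choices made only for illustration. It confirms the bound for a pair of symmetric (hence normal) matrices and shows that it can fail for a nonnormal pair.

```python
import cmath

def matmul_2x2(X, Y):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_radius_2x2(X):
    """Largest eigenvalue modulus, from the characteristic polynomial."""
    tr = X[0][0] + X[1][1]
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))

# Symmetric (normal) pair: rho(C1 C2) <= rho(C1) rho(C2) holds.
C1 = [[2.0, 1.0], [1.0, 2.0]]
C2 = [[1.0, 0.0], [0.0, -1.0]]
normal_lhs = spectral_radius_2x2(matmul_2x2(C1, C2))
normal_rhs = spectral_radius_2x2(C1) * spectral_radius_2x2(C2)

# Nilpotent (nonnormal) pair: both radii vanish, yet the product has radius 1.
N1 = [[0.0, 1.0], [0.0, 0.0]]
N2 = [[0.0, 0.0], [1.0, 0.0]]
nonnormal_lhs = spectral_radius_2x2(matmul_2x2(N1, N2))
nonnormal_rhs = spectral_radius_2x2(N1) * spectral_radius_2x2(N2)
```

The nilpotent pair illustrates why the argument above is only heuristic when the matrices involved are not normal: the product of the spectral radii gives no control over the spectral radius of the product.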

The conjecture is based upon the observation that, under the
assumptions of Theorem 4.8, each of $B_1[A+B_2]^{-1}$, $B_2[A+B_1]^{-1}$,
and $M_1$ is 'almost symmetric' in the interior of the grid $\Omega_h$, by
which we mean, for example, that

$(B_1[A+B_2]^{-1}w)_{j,k} \approx ([A+B_2]^{-1}B_1w)_{j,k}$   (5.1)

whenever the grid-point $(jh,kh)$ is not 'too close' to $\partial\Omega_h$.
This follows from a simple calculation that shows that the factors in each
of the products $B_1B_2$, $AB_1$, $AB_2$, $DB_1$, and $DB_2$ 'almost
commute' in the interior of the grid $\Omega_h$. However, if $(jh,kh)$ is
'close' to $\partial\Omega_h$, then (5.1) is a very poor approximation.
Although it is possible to be more specific about what we mean by 'almost
symmetric', this has not led us to a more convincing justification of the
conjecture. Therefore, we do not pursue this argument further at this
time.


Conjecture 5.2: If the assumptions of Theorem 4.8 hold, then
$\rho(I-S_1^{-1}A) \le 1-c_7h^{2/3}$ and the eigenvalues of $S_1^{-1}A$ lie
in the interval $[c_7h^{2/3},\,2-c_7h^{2/3}]$.


Discussion: If $C_1$ and $C_2$ are normal matrices, then

$\rho(C_1+C_2) \le \|C_1+C_2\| \le \|C_1\| + \|C_2\| = \rho(C_1) + \rho(C_2)$.

In addition, if Conjecture 5.1 holds, then
$\rho(I-M_1^{-1}A) \le 1-c_7h^{2/3}$; the conjecture that
$\rho(I-M_2^{-1}A) \le 1-c_7h^{2/3}$ can be defended in a similar manner.
Hence, if $I-M_1^{-1}A$ and $I-M_2^{-1}A$ were normal, then

$\rho(I-S_1^{-1}A) \le \frac{1}{2}\rho(I-M_1^{-1}A) + \frac{1}{2}\rho(I-M_2^{-1}A) \le 1-c_7h^{2/3}$.   (5.2)

Furthermore, since $S_1$ is symmetric, the eigenvalues of $S_1^{-1}A$ are
real. Hence, if (5.2) holds, then the eigenvalues of $S_1^{-1}A$ lie in the
interval $[c_7h^{2/3},\,2-c_7h^{2/3}]$.

Although $I-M_1^{-1}A$ and $I-M_2^{-1}A$ are not in general normal, they
are 'almost symmetric' in the interior of the grid $\Omega_h$ in the sense
used in the discussion following Conjecture 5.1.


Theorem 5.3: If the assumptions of Theorem 4.8 hold and Conjecture 5.2
is valid, then, for the SAD-DKR factorization, the number of iterations of
(1.5) required to reduce the A-norm of the initial error by a factor of
$\varepsilon$ is $O(h^{-2/3}\log\frac{1}{\varepsilon})$ and the associated
number of arithmetic operations is $O(h^{-8/3}\log\frac{1}{\varepsilon})$.
Moreover, if the iteration (1.5) is accelerated by the Chebyshev or
conjugate gradient techniques, then the number of iterations is decreased
to $O(h^{-1/3}\log\frac{1}{\varepsilon})$ and the associated number of
arithmetic operations is $O(h^{-7/3}\log\frac{1}{\varepsilon})$.

8 

Proof: If the assumptions of Theorem 4.8 hold and Conjecture 5.2 is
valid, then $\rho = \rho(I-S_1^{-1}A) \le 1-c_7h^{2/3}$. Moreover,
$A^{1/2}S_1^{-1}A^{1/2}$ is normal, since $S_1$ is symmetric. Hence, by
Lemma 3.1, the number of iterations of (1.5) required to reduce the A-norm
of the initial error by a factor of $\varepsilon$ is at most $n+1$, where

$n = \log\frac{1}{\varepsilon} \,/\, \log\frac{1}{\rho} = O(h^{-2/3}\log\frac{1}{\varepsilon})$.

Moreover, $S_1$ is symmetric and the eigenvalues of $S_1^{-1}A$ lie in the
interval $[c_7h^{2/3},\,2-c_7h^{2/3}]$. Hence, if the iteration (1.5) is
accelerated by the Chebyshev or conjugate gradient technique, then, by
Lemmas 3.2 and 3.3, the number of iterations of (1.5) required to reduce the
A-norm of the initial error by a factor of $\varepsilon$ is at most $n+1$,
where

$n = \log\frac{1}{\varepsilon} \,/\, \log\frac{1}{r} = O(h^{-1/3}\log\frac{1}{\varepsilon})$,

since, in this case, $a = 1-c_7h^{2/3}$ and

$\frac{1}{r} \ge 1 + ch^{1/3}$

for some positive constant $c$.

Since, for the SAD-DKR factorization, the number of multiplies needed to
perform one iteration of (1.5) or its Chebyshev or conjugate gradient
acceleration is proportional to the number of grid-points in the
discretization, the number of multiplies per iteration is $O(h^{-2})$.
Hence, the work estimates follow immediately from the bounds on the number
of iterations. Q.E.D.
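
The scaling claimed in Theorem 5.3 can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the paper's computation: the constant `c = 1`, the tolerance `eps = 1e-6`, and the function names are hypothetical, and the convergence factor `r = a / (1 + sqrt(1 - a^2))` is the usual Chebyshev factor for eigenvalues in the real interval `[1 - a, 1 + a]`, which we assume here in place of the paper's expression (3.9).

```python
import math

def stationary_iterations(h, eps, c=1.0):
    """n = log(1/eps) / log(1/rho) with rho = 1 - c h^(2/3)."""
    rho = 1.0 - c * h ** (2.0 / 3.0)
    return math.log(1.0 / eps) / math.log(1.0 / rho)

def accelerated_iterations(h, eps, c=1.0):
    """n = log(1/eps) / log(1/r) for eigenvalues in [1 - a, 1 + a].

    Assumption: a = 1 - c h^(2/3) and r = a / (1 + sqrt(1 - a^2)),
    the standard Chebyshev factor for a real interval.
    """
    a = 1.0 - c * h ** (2.0 / 3.0)
    r = a / (1.0 + math.sqrt(1.0 - a * a))
    return math.log(1.0 / eps) / math.log(1.0 / r)

def work(iterations, h):
    """Iterations times a per-iteration cost proportional to h^(-2)."""
    return iterations * h ** -2.0

# Refining h by a factor of 8 should multiply the iteration counts by
# roughly 8^(2/3) = 4 (stationary) and 8^(1/3) = 2 (accelerated).
eps = 1e-6
stationary_ratio = (stationary_iterations(1 / 8000, eps)
                    / stationary_iterations(1 / 1000, eps))
accelerated_ratio = (accelerated_iterations(1 / 8000, eps)
                     / accelerated_iterations(1 / 1000, eps))
```

Multiplying either count by the $h^{-2}$ cost per iteration, as `work` does, gives the exponents $-2-\frac{2}{3} = -\frac{8}{3}$ and $-2-\frac{1}{3} = -\frac{7}{3}$ quoted in the theorem.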


For the AD-DKR factorization, the work estimates are complicated
slightly by the appearance of the constants $c$ and $q$ in equations (3.3)
and (3.8) and the constant $b$ in the expression for $r$ (3.9). Clearly,
these constants depend upon the matrices $M_1$ and $A$ and, consequently,
may grow as $h \to 0$. However, if they do not grow 'too fast' as
$h \to 0$, a result similar to Theorem 5.3 holds for the AD-DKR
factorization as well.

Theorem 5.4: If

1. the assumptions of Theorem 4.8 hold,

2. Conjecture 5.1 is valid, and

3. the constants $c$ and $q$ that appear in the inequality (3.3) satisfy
$c = O(\varepsilon^{-k})$ and $q \le Q$, for some constants $k$ and $Q$
independent of $h$,

then, for the AD-DKR factorization, the number of iterations of (1.5)
required to reduce the A-norm of the initial error by a factor of
$\varepsilon$ is $O(h^{-2/3}\log\frac{1}{\varepsilon})$ and the associated
number of arithmetic operations is $O(h^{-8/3}\log\frac{1}{\varepsilon})$.
Moreover, if the iteration (1.5) is accelerated by the Chebyshev technique
and Assumption 3 is replaced by

3'. the constants $c$ and $q$ that appear in the inequality (3.8) satisfy
$c = O(\varepsilon^{-k})$ and $q \le Q$, for some constants $k$ and $Q$
independent of $h$, and

4. the constant $b$ that appears in the expression for $r$ (3.9) satisfies
$b = O(h^{1/3})$,

then the number of iterations is decreased to
$O(h^{-1/3}\log\frac{1}{\varepsilon})$, with the associated number of
arithmetic operations being $O(h^{-7/3}\log\frac{1}{\varepsilon})$.

Proof: If the assumptions of Theorem 4.8 hold and Conjecture 5.1 is
valid, then

$\rho = \rho(I-M_1^{-1}A) \le 1-c_7h^{2/3}$   (5.3)

for some positive constant $c_7$. By Lemma 3.1, the number of iterations of
(1.5) required to reduce the A-norm of the initial error by a factor of
$\varepsilon$ is at most $n+1$, where

$(n-q)\log\frac{1}{\rho} = \log\frac{c}{\varepsilon} = \log\frac{1}{\varepsilon} + \log c$.

Therefore, by Assumption 3 and (5.3), $n \le m$, where

$(m-Q)\,c_7h^{2/3} = \log C + (k+1)\log\frac{1}{\varepsilon}$

for some constants $Q$, $k$, and $C$ independent of $h$, whence
$n = O(h^{-2/3}\log\frac{1}{\varepsilon})$.
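
The step from (5.3) to the bound on $n$ rests on the elementary inequality $\log\frac{1}{\rho} \ge 1-\rho$. Spelled out, in our own reconstruction (with $C$ denoting a constant such that $c \le C\varepsilon^{-k}$, and restricting to the nontrivial case $n \ge Q$):

```latex
\[
  (n - Q)\, c_7 h^{2/3}
  \;\le\; (n - q)\,(1 - \rho)
  \;\le\; (n - q)\,\log\frac{1}{\rho}
  \;=\; \log\frac{1}{\varepsilon} + \log c
  \;\le\; \log C + (k+1)\log\frac{1}{\varepsilon},
\]
\[
  \text{so that}\quad
  n \;\le\; Q + \frac{\log C + (k+1)\log(1/\varepsilon)}{c_7 h^{2/3}}
  \;=\; O\!\left(h^{-2/3}\log\frac{1}{\varepsilon}\right).
\]
```

The first inequality uses $q \le Q$ together with $1-\rho \ge c_7h^{2/3}$, and the last uses the growth condition on $c$.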

By Assumptions 1, 2, and 4, the eigenvalues of $M_1^{-1}A$ lie in the
ellipse

$E = \{ z \in \mathbb{C} : z = 1 - a\cos\theta + i\,b\sin\theta,\ 0 \le \theta \le 2\pi \}$,

where $a = 1-c_7h^{2/3}$ and $b = O(h^{1/3})$, whence

$\frac{1}{r} \ge 1 + ch^{1/3}$

for some positive constant $c$. Therefore, by Assumption 3', Lemma 3.2, and
an argument similar to the one used above for the stationary iteration, if
the iteration (1.5) is accelerated by the Chebyshev technique, then the
number of iterations required to reduce the initial error by a factor of
$\varepsilon$ is decreased to $O(h^{-1/3}\log\frac{1}{\varepsilon})$.

Since, for the AD-DKR factorization, the number of multiplies needed to
perform one iteration of (1.5) or its Chebyshev acceleration is
proportional to the number of grid-points in the discretization, the number
of multiplies per iteration is $O(h^{-2})$. Hence, the work estimates follow
immediately from the bounds on the number of iterations. Q.E.D.


We have not been able to establish the validity of Assumptions 3, 3',
and 4 for the AD-DKR factorization, although we believe that the violation
of either Assumption 3 or 3' is very unlikely in practice. On the other
hand, the validity of Assumption 4 is questionable. For a few sample
problems with coarse discretizations, we computed the eigenvalues of
$M_1^{-1}A$ and found some of them to have small, but not insignificant,
imaginary parts. However, the numerical results presented in the next
section do not contradict the conclusion of Theorem 5.4, which lends
support to our belief that the assumptions on which the theorem is based
may be valid as well.

Finally, we re-emphasize that the class of problems of the form (1.2)
to which our convergence results for the AD-DKR and SAD-DKR factorizations
pertain is essentially the same as the class considered by Dupont, Kendall,
and Rachford [5] for the DKR factorization, except for the added
restriction that $a_1 = a_2$. Experimental results show that, if this
restriction is violated, then the Alternating-Direction technique may not
improve the rate of convergence of the iteration (1.5) or its acceleration.
Furthermore, note that the parameters $\omega = 1$ and
$\alpha_{j,k} = c_0h^{4/3}$ recommended for use with the AD-DKR and SAD-DKR
factorizations are substantially different from the corresponding
parameters $\omega = O(h)$ and $\alpha_{j,k} = ch^2$ recommended by Dupont,
Kendall, and Rachford [5] for the DKR factorization. Moreover, experimental
evidence suggests that the AD-DKR and SAD-DKR factorizations do not achieve
the substantially improved rates of convergence that we have observed if
the parameters recommended for the DKR factorization are used. A more
complete discussion of these observations is given in [3].


6. Numerical Results.

In this section, we present some numerical results that support the
conjectures of the previous section.

For this experiment, we chose the Dirichlet problem with homogeneous
boundary conditions for the two-dimensional elliptic equation (1.2) with
coefficients

$a_1(x,y) = a_2(x,y) = e^{xy}$,  $q(x,y) = -1/(1+x+y)$

on the L-shaped domain $\Omega$ having vertices $(0,0)$, $(1,0)$,
$(1,\frac{1}{2})$, $(\frac{1}{2},\frac{1}{2})$, $(\frac{1}{2},1)$, $(0,1)$.
The domain was discretized with $N+1$ evenly spaced grid-lines in each
direction; $h = 1/N$. For $N = 10, 20, 30, \ldots, 90$, we discretized (1.2)
using the standard five-point operator described in Section 2. We computed
$r$, the right side of the resulting system of linear equations (1.1), so
that the system has the solution

$w_{j,k} = x_j(\frac{1}{2}-x_j)(1-x_j)\,y_k(\frac{1}{2}-y_k)(1-y_k)$,

where

$x_j = jh$ and $y_k = kh$.
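
As an illustration of this test problem, the following sketch tabulates the prescribed solution on the L-shaped grid and checks that it vanishes at every boundary grid-point, as the homogeneous Dirichlet condition requires. It is our own construction, not the authors' code: the function names and the dictionary-based storage are hypothetical, and $N$ is assumed even so that $x = \frac{1}{2}$ and $y = \frac{1}{2}$ are grid-lines.

```python
def exact_solution(N):
    """Grid function w[j, k] on the closed L-shaped domain, h = 1/N (N even)."""
    w = {}
    for j in range(N + 1):
        for k in range(N + 1):
            if 2 * j > N and 2 * k > N:
                continue  # the removed quadrant x > 1/2, y > 1/2
            x, y = j / N, k / N
            w[j, k] = x * (0.5 - x) * (1 - x) * y * (0.5 - y) * (1 - y)
    return w

def is_interior(j, k, N):
    """True for grid-points strictly inside the L-shaped domain."""
    return 0 < j < N and 0 < k < N and (2 * j < N or 2 * k < N)

N = 10
w = exact_solution(N)
interior = [p for p in w if is_interior(p[0], p[1], N)]
boundary = [p for p in w if not is_interior(p[0], p[1], N)]
max_boundary_value = max(abs(w[p]) for p in boundary)
```

Each factor of the product vanishes on one pair of boundary segments, which is why the solution satisfies the boundary condition on all six edges of the domain.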

Starting from an initial guess of zero, we solved (1.1) by the iterative
methods discussed in the previous section. Also included for comparison is
the conjugate gradient acceleration of (1.5) based upon the DKR
factorization. In each case, we recorded the number of iterations required
to reduce the A-norm of the initial error by a factor of $\varepsilon = 10$

In Figure 6-1, the number of iterations required to solve (1.1) to
the specified accuracy are listed for the methods

1. SIN, the stationary iteration (1.5) based upon the nonsymmetric
AD-DKR factorization $M_1$ with $\alpha_{j,k} = h^{4/3}$ and iteration
parameter $\omega = 1$,

2. SIS, the stationary iteration (1.5) based upon the symmetric SAD-DKR
factorization $S_1$ with $\alpha_{j,k} = h^{4/3}$ and iteration parameter
$\omega = 1$,

3. CHN, the Chebyshev acceleration of the stationary iteration (1.5)
based upon the nonsymmetric AD-DKR factorization $M_1$ with
$\alpha_{j,k} = h^{4/3}$ and iteration parameters chosen to minimize
$P_n(z)$ on the interval

4. CHS, the Chebyshev acceleration of the stationary iteration (1.5)
based upon the symmetric SAD-DKR factorization $S_1$ with
$\alpha_{j,k} = h^{4/3}$ and iteration parameters chosen to minimize
$P_n(z)$ on the interval $[h^{2/3},\,2-h^{2/3}]$,

5. CGS, the conjugate gradient acceleration of the stationary iteration
(1.5) based upon the symmetric SAD-DKR factorization $S_1$ with
$\alpha_{j,k} = h^{4/3}$, and

6. CGDKR, the conjugate gradient acceleration of the stationary
iteration (1.5) based upon the DKR factorization with
$\alpha_{j,k} = h^2$.

For each method, both the modified (M) and unmodified (UN) DKR
factorizations were used. Also listed in the last two lines of Figure 6-1
are the expected rate of convergence, E, and the observed rate, R, where R
is computed by a least squares fit to

log N = R log (NUMBER OF ITERATIONS) + C

for N = 30, 40, ..., 90.
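
A least squares fit of this kind can be sketched as follows. This is a hedged illustration only: the iteration counts are synthetic (generated to grow like $N^{2/3}$) and are NOT the values of Figure 6-1, the helper `least_squares_line` is a hypothetical name, and the relation is fitted in the orientation printed above, with log N as the dependent variable.

```python
import math

def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic counts (not the paper's data): iterations growing like N^(2/3).
grid_sizes = [30, 40, 50, 60, 70, 80, 90]
iterations = [3.0 * N ** (2.0 / 3.0) for N in grid_sizes]

# Relation as printed: log N = R log(iterations) + C.
R, C = least_squares_line([math.log(it) for it in iterations],
                          [math.log(N) for N in grid_sizes])
```

With the relation written this way, iteration counts proportional to $N^{2/3}$ correspond to a fitted slope of $\frac{3}{2}$; fitting log(iterations) against log N instead would return the reciprocal slope $\frac{2}{3}$.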


For each of the methods, the numerical results for the modified and
unmodified DKR factorizations are almost identical. Consequently, we have
plotted the number of iterations for the methods based upon the unmodified
DKR factorization only in Figures 6-2 and 6-3. The CGDKR method is
included in each graph for comparison.

Figure 6-1: The number of iterations required to reduce the A-norm of the
error by a factor of $\varepsilon = 10$ for the stationary iteration
(1.5) and its Chebyshev and conjugate gradient accelerations based
upon the AD-DKR, SAD-DKR, and DKR factorizations.

Figure 6-2: The number of iterations required by the methods SIS (1),
CHS (2), CGS (3), and CGDKR (4) to reduce the A-norm of the
error by a factor of $\varepsilon = 10$ .

Figure 6-3: The number of iterations required by the methods SIN (1),
CHN (2), and CGDKR (3) to reduce the A-norm of the error by a factor of
$\varepsilon = 10$ .

The rate of convergence of the methods, with the possible exception
of CHN, agrees very well with the rate predicted by the analysis in the
previous section. The reason for the discrepancy for CHN is not clear to
us, but it could be that Assumption 4 of Theorem 5.4 is violated or that
the parameters that we chose for the Chebyshev iteration are not optimal.
This question requires further investigation.

Although the principal aim of this paper is to present asymptotic
work estimates for several ADIF methods and not to compare the efficiency
of various algorithms for solving (1.1), we conclude with a few
observations about the efficiency of CGS. Even on coarse grids, the number
of iterations required to solve this test problem by CGS is about half the
number required by CGDKR. Moreover, this ratio decreases with N, as the
theory predicts. However, straightforward implementations of CGS and CGDKR
require $16(N-1)^2$ and $44(N-1)^2$, respectively, multiply-adds per
iteration. Hence, for these implementations, this problem, and the grids
considered, CGDKR requires less work than CGS to solve the problem. But the
relative efficiency of these two methods is problem dependent: for the
Laplacian on a unit square with the same sequence of grids and
implementations, we found that CGS requires slightly less work than CGDKR
on the fine grids. In addition, Eisenstat [6] has shown that CGDKR can be
implemented in $10(N-1)^2$ multiply-adds per iteration. Some of his
techniques are applicable to CGS as well, and it is our hope that the work
per iteration for this method can be significantly reduced. We intend to
consider the question of efficient implementation of ADIF methods in [3],
as well as the comparison of these methods with others